EXPLAIN Your SQL Query Plan

您所在的位置：网站首页 › sql a-b › EXPLAIN Your SQL Query Plan

EXPLAIN Your SQL Query Plan

#EXPLAIN Your SQL Query Plan | 来源: 网络整理| 查看: 265

QuestDB's core engineer, Bolek Ziobrowski, has implemented the EXPLAIN SQL keyword. In this series of articles, he is going to shed light on how to fine tune your SQL queries to improve performance, with the help of EXPLAIN. Bolek starts the first post of the series with his journey in database optimization and an introduction to the EXPLAIN execution plan.

Photo by Braydon Anderson via UnsplashOptimization: a constant predicament#

One of the first tasks I got when I started working as a Java software developer was about optimization: optimization for nightly processes of an online reservation system, a traditional three-tiered Java web application backed by then-popular RDBMS. A story like many others - processes that started fast and nimble became slow and resource-hungry over the years, up to a point where they were running late, overlapped with daily load, or just errored out.

Not good.

Quick code reconnaissance showed that the said batch processes relied heavily on the database. They ran many custom multi-page-long SQLs one after another and exported some results to data files. Checking logs revealed that some took seconds while others dragged on for tens of minutes. Profiling JVM only showed that the application is waiting on queries. System logs weren't more useful than that - they showed lots of IO and CPU load but didn't give any hint as to why.

It seems the only option is to speed up queries.

Right, but how?

At first, I had no idea. I simply took the slowest query and started making changes here and there. Most changes didn't improve the query speed, and when they did, it turned out that the query was broken. Applying good advice from Internet forums didn't help at all. After a few hours, I gave up and tried asking local database gurus for help. To my surprise, the suggestions I received were repeating the 'Internet wisdom', e.g.:

"Use hint X.""UNION is slow, rewrite it to UNION ALL.""You have to use index, because full table scan is slow."

While still trying to apply any advice I could find and feeling tired, I discovered a thick tome on the company's IT bookshelf - an RDBMS textbook. That's where I found the right tool for the job - the EXPLAIN command.

Behind the database: the basics#

Initially, EXPLAIN seemed like some black art - hard to follow and full of arcane terms. Before it really clicked with me, I had to go through the book and learn some fundamental concepts. That's why, before going any further, let's focus on basic stuff for a moment.

QuestDB processes queries in the following phases :

Parsing - Query text is parsed and a model is created.Optimization - The query is rewritten for the best performance.Planning - An execution plan is created, and the best table scan and join methods are chosen.Execution - Tables are accessed according to the execution plan and outcomes are produced.

SQL is just the query interface and the database must decide on a great many things before it can produce results. These things include:

How to access a table - e.g. full or index scan, scan forward or backward.How to join tables - e.g. nested loop, merge, or hash join.How to order joins - e.g. 5 tables can be joined in 5! = 120 different orders.How to optimize expressions - e.g. evaluate constant expression once instead of for every row.

The above is just the tip of the iceberg, a lot is happening under the hood and many factors influence the end performance of a query.

EXPLAIN explained#

The EXPLAIN SQL keyword shows the execution plan (the outcome of step 3. above), which is the logical algorithm used to implement a given statement and thus 'explains' its performance. Even though it's not part of the SQL standard, many databases, such as PostgreSQL, MySQL, and Oracle, do implement their own variant of EXPLAIN. Contrary to SQL, which is a high-level declarative language, EXPLAIN uses terms that are often low-level, and related to the database implementation, for example, hash joins or index scan. Learning the fundamental QuestDB concepts is essential to fully understand how queries work and how to improve their performance.

Thanks to this type of tool, I was able to first, understand what happens under the hood when a query is run, and why it happens. Furthermore, I was able to reduce the nightly process run time well beyond what was expected.

Time for practice: example 1#

Now, it's time for practice. Let's have a look at the following queries that use the trades table in the QuestDB demo instance:

CREATE TABLE trades ( symbol SYMBOL CAPACITY 256 CACHE, side SYMBOL CAPACITY 256 CACHE, price DOUBLE, amount DOUBLE, timestamp TIMESTAMP) TIMESTAMP (timestamp) PARTITION BY DAY;

Query A:

SELECT count(*) FROM trades WHERE price = 0.0f;

Query B:

SELECT count(*) FROM trades WHERE price = 0.0;

For hot data, response times are :

Query A: 160msQuery B: 50ms

Can you guess why?

Photo by Nguyen Dang Hoang Nhu via Unsplash

It could be a number of things, but why guess when you can check:

EXPLAIN SELECT count(*) FROM trades WHERE price = 0.0f;QUERY PLANCount (4)聽聽聽聽Async Filter (3)聽聽聽聽聽聽filter: price=0.0f聽聽聽聽聽聽聽聽DataFrame聽聽聽聽聽聽聽聽聽聽聽聽Row forward scan (2)聽聽聽聽聽聽聽聽聽聽聽聽Frame forward scan on: trades (1)

The output is a tree of nodes - stages in query processing - that have properties and/or subnodes (aka child nodes).

In the plan above we have the following nodes and properties:

nodes - They start with an upper-case letter and have bigger indents than properties:CountAsync FilterDataFrameRow forward scanFrame forward scanproperties:price=0.0fon: trades

According to the plan, the query contains the following steps:

Forward scan all the partitions of the table.Forward scan all rows in each partition.Apply filters to each row.Count all rows that pass it.

Next, we check the plan for query B:

EXPLAIN SELECT count(*) FROM trades WHERE price = 0.0;QUERY PLANCount聽聽聽聽Async JIT Filter聽聽聽聽聽聽filter: price=0.0d聽聽聽聽聽聽聽聽DataFrame聽聽聽聽聽聽聽聽聽聽聽聽Row forward scan聽聽聽聽聽聽聽聽聽聽聽聽Frame forward scan on: trades

Can you spot the difference?

It's about the JIT.

The second query is faster because it uses Just-In-Time-compiled filter!

The first query doesn't use it due to column and argument type mismatch.

In both cases, the trades table is fully scanned, which is reflected by DataFrame and the child nodes.

note

The QuestDB Web Console limits the number of output rows to 1k, so it is not the best way to measure the performance of queries that return data sets bigger than the limit.

Time for practice: example 2#

Let's look at the pos, another table available in the QuestDB demo instance:

CREATE TABLE pos ( time TIMESTAMP, id SYMBOL CAPACITY 256 CACHE INDEX CAPACITY 512, lat DOUBLE, lon DOUBLE, geo6 GEOHASH(6c), geo12 GEOHASH(12c)) TIMESTAMP (time) PARTITION BY DAY;

What if we need to count the number of records with a given id prefix?

SELECT count(*)FROM posWHERE substring(id,1,21) = 'NSTTQFPKISYEGDLPJFEKP'

The result is 678 and it takes about 1.2s.

A bit slow for just a few hundred records.

Let's used EXPLAIN on the query and inspect the plan:

QUERY PLANCount (1)聽聽聽聽Async Filter (2)聽聽聽聽聽聽filter: substring(id,1,21)='NSTTQFPKISYEGDLPJFEKPS'聽聽聽聽聽聽workers: 14聽聽聽聽聽聽聽聽DataFrame (3)聽聽聽聽聽聽聽聽聽聽聽聽Row forward scan (3)聽聽聽聽聽聽聽聽聽聽聽聽Frame forward scan on: pos (3)

I have annotated the plan with some numbers to illustrate the information. From the table, we can see that query counts (1 in the table) records filtered by an asynchronous, multi-threaded filter (marked as 2), which scans the whole table (marked as 3).

Considering the number of records (678 rows in the result vs 120M in the whole table) it seems strange that the index on id wasn't used.

What can we do here?

In many databases, index access requires a query predicate to match the expression used in the index definition.

If a mismatch is detected, the index would need to be fully scanned, which is much more costly than a single value or value range lookup.

In this case, we have a symbol column, so assuming that id contains A-Z letters only and is 22-characters long, we can rewrite the expression with a IN clause:

SELECT count(*)FROM posWHERE id IN ('NSTTQFPKISYEGDLPJFEKPA','NSTTQFPKISYEGDLPJFEKPB' ... 'NSTTQFPKISYEGDLPJFEKPZ');

In effect, the response time shrinks to 3ms, about 400 times less.

Let's check out why:

EXPLAIN SELECT count(*)FROM posWHERE id IN ('NSTTQFPKISYEGDLPJFEKPA','NSTTQFPKISYEGDLPJFEKPB' ... 'NSTTQFPKISYEGDLPJFEKPZ');QUERY PLANCount聽聽聽聽FilterOnValues聽聽聽聽聽聽聽聽Cursor-order scan聽聽聽聽聽聽聽聽聽聽聽聽Index forward scan on: id deferred: true聽聽聽聽聽聽聽聽聽聽聽聽聽聽filter: id='NSTTQFPKISYEGDLPJFEKPA'聽聽聽聽聽聽聽聽聽聽聽聽...聽聽聽聽聽聽聽聽聽聽聽聽Index forward scan on: id deferred: true聽聽聽聽聽聽聽聽聽聽聽聽聽聽filter: id='NSTTQFPKISYEGDLPJFEKPZ'聽聽聽聽聽聽聽聽Frame forward scan on: pos

This time, instead of fully scanning the table, QuestDB decides to scan the index :

First, it finds id='NSTTQFPKISYEGDLPJFEKPA' key.

The engine iterates over the list of associated row ids and accesses the pos table.

It locates the 'NSTTQFPKISYEGDLPJFEKPB' key, etc.

If we had to query by one complete value, like this one:

SELECT count(*)FROM posWHERE id IN ('NSTTQFPKISYEGDLPJFEKPS')

The plan would be even simpler and more effective:

QUERY PLANCount聽聽聽聽DeferredSingleSymbolFilterDataFrame聽聽聽聽聽聽聽聽Index forward scan on: id deferred: true聽聽聽聽聽聽聽聽聽聽filter: id='NSTTQFPKISYEGDLPJFEKPS'聽聽聽聽聽聽聽聽Frame forward scan on: pos

It returns results in about 0.5 ms.

note

The rule-based optimizer used in QuestDB considers index access only for simple symbol_column = constant | function() expressions.

Let's get back to the trades table. Say we'd like to count the number of trades in the last 24 hours :

SELECT count(*)FROM tradesWHERE timestamp > dateadd('d', -1 ,systimestamp());

It works fine but why does it take a second?

Photo by Camylla Battani via Unsplash

Let's see :

EXPLAIN SELECT count(*)FROM tradesWHERE timestamp > dateadd('d', -1 ,systimestamp());

Plan:

QUERY PLANCount聽聽聽聽Async Filter聽聽聽聽聽聽filter: dateadd('d',-1,systimestamp())

【本文地址】

公司简介

联系我们